Conversation

@Suryakantdsa commented on Aug 2, 2025

Pull Request Template

Description:

  • Added a GCS file provider in pkg/datasource/file/gcs.
  • Implemented Create, Remove, ReadDir, Open, Stat, and MakeDir using cloud.google.com/go/storage.
  • Simulates directories using object key prefixes (a brief sketch of this pattern follows below).
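
For illustration, a minimal sketch (not this PR's actual code) of how prefix-based directory simulation typically works with cloud.google.com/go/storage; the listDir helper name is hypothetical:

package gcs

import (
	"context"
	"fmt"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
)

// listDir lists a simulated "directory" by treating the object key prefix
// as a path and the delimiter as a directory separator.
func listDir(ctx context.Context, bucket *storage.BucketHandle, dir string) error {
	it := bucket.Objects(ctx, &storage.Query{
		Prefix:    dir + "/", // the key prefix acts as the directory path
		Delimiter: "/",       // groups deeper keys into sub-"directories"
	})

	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			return nil
		}

		if err != nil {
			return err
		}

		// attrs.Name is set for objects ("files"); attrs.Prefix is set
		// for collapsed sub-prefixes (simulated sub-directories).
		fmt.Println(attrs.Name + attrs.Prefix)
	}
}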

Checklist:

  • I have formatted my code using goimports and golangci-lint.
  • All new code is covered by unit tests.
  • This PR does not decrease the overall code coverage.
  • I have reviewed the code comments and documentation for clarity.

Fixes #2013

@Suryakantdsa (Author)

Hi @Umang01-hash, @coolwednesday,

Please have a look at my PR and let me know if you notice anything that needs changes.

Thanks!

@coolwednesday (Member)

Hey!

Can you add setup details for ease of testing, plus documentation for users who will want to set this up by referring to the docs once this feature ships?

@Suryakantdsa (Author) commented on Aug 4, 2025

Sure @coolwednesday, I will add the setup guide and documentation to help users use this feature.

Thanks for the feedback 👍

@Suryakantdsa (Author)

Hey @coolwednesday,

I noticed that the setup details for file handling (S3, FTP/SFTP, etc.) are already in gofr/docs/advanced-guide/handling-file/page.md.

Just to confirm: should I add the GCS setup instructions in the same file, or would you prefer a separate file?

Thanks!

@Suryakantdsa (Author)

Hey @coolwednesday,

Could you please take a look at the question above?

@Suryakantdsa (Author)

Hey @ccoVeille, @Umang01-hash,
Could you please take a look at the question above?

@coolwednesday (Member) left a comment

Please update the unified handling-files documentation to include a code snippet demonstrating how to add GCS support.

Additionally, make the necessary changes in the Container package so that it works like any other datasource. For reference, review PR #2076 and replicate the relevant changes.

After that, ensure traces, logs, and metrics are integrated (also refer to #2076 for implementation details).

Once implemented:

  • Add screenshots showing logs, traces, and metrics from the running cloud store.
  • Add setup documentation in the docs folder.
  • Update the navigation.js.
  • Include any Docker container commands used in contributing.md.

@Suryakantdsa (Author)

Ok, got it. I'm working on it and will let you know once it's completed.
Thanks, @coolwednesday!

@Suryakantdsa force-pushed the feat/gcs-file-provider branch from 8c79d6a to 7b1824a on August 16, 2025
@Suryakantdsa (Author)

Hey @Umang01-hash, @coolwednesday,

  • I have updated the existing unified file-handling documentation (handling-file.md) to include GCS setup and usage (no new navigation entry added).
  • Included Docker container commands in contributing.md.

Here are some screenshots showing logs, traces, and metrics from the running cloud store.

[Screenshots: logs, traces, and metrics from the running cloud store]

Please let me know if you notice anything that needs changes.
Thank you so much!

@coolwednesday (Member) left a comment

Other than this, the whole implementation looks good to me.

@akshat-kumar-singhal (Contributor) left a comment

An overall rewrite of the file package should be considered before adding more implementations like GCS.

@Suryakantdsa (Author)

Hi @Umang01-hash,

My branch was not up to date with gofr/development. I've rebased it and pushed the updates now.
Please have a look.

Thank you!

@Umang01-hash (Member) left a comment

  • Please remove all the go.mod changes in the other datasources, as they are outside the scope of your PR. The changes inside examples/using-add-filestore are also not required; please revert all of these.
  • Only some of the newly added code has tests. Please write tests using mocks for all of the newly added code (feel free to refer to other datasources to see how they use mocks for testing).
  • Only export errors, constants, etc. when needed; unexport the ones not used outside the directory.

@Suryakantdsa (Author)

Hi @Umang01-hash,
I've noted the required changes. I'm working on them and will update you once they're completed.

Thanks for your time.

@Suryakantdsa (Author)

Hi @Umang01-hash,

I’ve made the required changes. Please have a look when you get a chance.

Thank you!

@Umang01-hash (Member) left a comment

Hey @Suryakantdsa, thank you for making the requested changes.

After running the tests, here is your current code coverage:
[Screenshot: code-coverage report, Sept 5, 2025]

Please make sure that each file (except mocks and interfaces) has at least 75-80% code coverage.

  • Also, in the example you provided in the documentation, the gcs.Config:
&gcs.Config{
		BucketName:      "my-bucket",
		CredentialsJSON: readFile("gcs-credentials.json"),
		ProjectID:       "my-project-id",
	}

The Endpoint field is missing here; see the sketch below.
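
For example, a hedged sketch with the missing field added (the endpoint value shown is an assumption, e.g. a local emulator):

&gcs.Config{
	BucketName:      "my-bucket",
	CredentialsJSON: readFile("gcs-credentials.json"),
	ProjectID:       "my-project-id",
	Endpoint:        "http://localhost:4443", // assumed value for a local GCS emulator
}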

@Umang01-hash (Member)

@Suryakantdsa are you still working on this PR?

@Suryakantdsa (Author)

Hi @Umang01-hash,
I was unavailable earlier due to a medical emergency, but I'm back to working on this PR now.

@Suryakantdsa (Author)

Hi @Umang01-hash,

I have addressed the required changes.
Please take a look when you get a chance.

Thanks!

@Umang01-hash previously approved these changes on Oct 15, 2025
@Umang01-hash (Member)

@Suryakantdsa Thank you for all the changes; the PR looks good now. Only the retry part is left, for which I have created another issue since it can be handled in another PR.

@akshat-kumar-singhal (Contributor)

@Umang01-hash I see a few areas of improvement:

  • WriteAt, ReadAt, etc. are implemented as "not supported" - why do we need them unless we're embedding the ReaderAt/WriterAt interfaces in our interface and hence must implement them?
  • The metrics we're exposing have the location, i.e. file path, as a label value - I hope we've considered the cardinality aspect here.
  • Metrics exposed across AWS/GCP (and other providers later) should be consistent - consider a single implementation in the main file package, restricting the respective packages to the file-transfer implementation only. A unified metric set with a label for storage type (AWS/GCP/FTP, etc.) would give application developers better visibility and limit the number of new metrics/panels to be added.
  • We could view the entire file-system implementation as file operations + storage/retrieval: the file operations are largely common and can be implemented in the file package, while retrieval/storage is implemented in the respective packages (FTP/SFTP/S3/GCS, etc.). Those implementations can additionally check a version key/lock before updating the file on the remote to ensure consistent writes.
  • A separate FileLog for GCS could be avoided - it should be available in the file package itself.
  • The definition of standard constants like SUCCESS and ERROR looks off - I understand that "a little copying is better than a little dependency", but we could have these in the file package, which we'd already be importing in the various implementations.
  • Defining histogram buckets within the implementation package should be avoided.

cc @coolwednesday @gizmo-rt @Suryakantdsa

@Umang01-hash (Member)


@akshat-kumar-singhal Thanks for your valuable feedback and review.

Let's try and address these comments one by one:

1. WriteAt, ReadAt, etc. are implemented as "not supported" - why do we need them unless we're embedding the ReaderAt/WriterAt interfaces in our interface and hence must implement them?

  • These methods must be implemented because the File struct must satisfy the file.File interface:
// File represents a file in the filesystem.
type File interface {
	io.Closer
	io.Reader
	io.ReaderAt
	io.Seeker
	io.Writer
	io.WriterAt

	ReadAll() (RowReader, error)
}

which embeds io.ReaderAt, io.Seeker, and io.WriterAt. By implementing them with error returns, we maintain interface compliance while clearly indicating these operations aren't supported by GCS.

2. The metrics we're exposing have the location, i.e. file path, as a label value - hope we've considered the cardinality aspect here.

  • Looking at the implementation, I found that file paths (Location) are only included in debug logs via f.logger.Debug(fl), not in metrics. The metrics histogram only uses operation type and status as labels ("type", fl.Operation, "status", clean(fl.Status)), keeping cardinality manageable. A minimal sketch of this label scheme follows below.
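
For illustration only (the Metrics interface stands in for GoFr's metrics manager, and the metric name is hypothetical):

import "context"

type Metrics interface {
	RecordHistogram(ctx context.Context, name string, value float64, labels ...string)
}

// observeOperation records only operation type and status as labels, so
// cardinality stays bounded regardless of how many distinct paths are accessed.
func observeOperation(ctx context.Context, m Metrics, operation, status string, seconds float64) {
	m.RecordHistogram(ctx, "app_gcs_stats", seconds,
		"type", operation,
		"status", status,
	)
}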

3. Metrics exposed across AWS/GCP (and other providers later) should be consistent - consider a single implementation in the main file package, restricting the respective packages to the file-transfer implementation only. A unified metric set with a label for storage type (AWS/GCP/FTP, etc.) would give application developers better visibility and limit the number of new metrics/panels to be added.

4. We could view the entire file-system implementation as file operations + storage/retrieval: the file operations are largely common and can be implemented in the file package, while retrieval/storage is implemented in the respective packages (FTP/SFTP/S3/GCS, etc.). Those implementations can additionally check a version key/lock before updating the file on the remote to ensure consistent writes.

  • I agree that a unified metrics system and better separation between common operations and provider-specific code would be beneficial. However, this PR is specifically focused on adding GCS integration to match existing functionality. The TODO comment already acknowledges the need to refactor both implementations to use a common interface:
// TODO: Future improvement - Refactor both S3 and GCS implementations to use a
// common CloudStorageClient interface for better abstraction. This implementation
// currently follows the same pattern as S3 to maintain consistency with existing code.
  • Following single-responsibility principles for PRs, I propose implementing these architectural improvements in a dedicated follow-up PR. This keeps the refactoring separate from the GCS integration work, making both changes easier to review and more reliable. A hypothetical sketch of such a common interface follows below.
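
A hypothetical sketch of what that common interface could look like (the method set is illustrative, not the project's final design):

import (
	"context"
	"io"
)

// CloudStorageClient abstracts the provider-specific storage/retrieval
// operations that S3, GCS, and future backends would each implement.
type CloudStorageClient interface {
	Upload(ctx context.Context, key string, r io.Reader) error
	Download(ctx context.Context, key string) (io.ReadCloser, error)
	Delete(ctx context.Context, key string) error
	List(ctx context.Context, prefix string) ([]string, error)
}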

5. A separate FileLog for GCS could be avoided - it should be available in the file package itself.

  • I agree that duplicating the FileLog structure across providers creates unnecessary redundancy. Consolidating it into a common structure in the file package would be beneficial; however, that would require coordinated changes across all storage providers (S3, SFTP, and GCS). For better change management, and to keep this PR focused on adding GCS integration, I propose addressing this in a separate follow-up PR that properly refactors logging across all providers.

6. The definition of standard constants like SUCCESS and ERROR looks off - I understand that "a little copying is better than a little dependency", but we could have these in the file package, which we'd already be importing in the various implementations.

  • I agree that standardizing constants like statusSuccess and statusErr in the main file package would improve consistency and reduce duplication across storage implementations. Similar to the other architectural suggestions, this would be best addressed as part of a broader refactoring effort affecting all file storage implementations.

7. Defining histogram buckets within the implementation package should be avoided.

  • I agree that hardcoding histogram bucket definitions within implementation packages creates inconsistency. Moving these definitions to the main file package would ensure standardized metrics across all storage implementations and should be included in our planned refactoring PR.

I have created issue #2433 capturing all the changes you suggested to improve the file package implementation. Once GCS is merged, and before adding any new file store, our core focus will be to refactor the current implementation for all file stores.

@Umang01-hash (Member)

@Suryakantdsa @akshat-kumar-singhal An update here: I have made some changes to how metrics and logs are handled by the different file systems:

  • Extracted logging and metrics functionality to the common file package.
  • Standardized constants like StatusSuccess and StatusError in the file package.
  • Moved histogram bucket definitions to the main file package for consistency.

The GCS implementation now uses these common components. You can check these in the new PR: #2435.

I think we can now review and merge that. The next set of things to refactor in the file systems will be:

  • Make the other file systems (S3, SFTP, FTP) use the common logger and metrics components of the file package.
  • Refactor the file-system implementation to separate common operations from provider-specific storage/retrieval.
  • Create a unified interface that all providers (S3, GCS, FTP, etc.) can implement.

The current changes have significantly improved code organization while maintaining compatibility with existing implementations. After merging this PR, we can focus on the deeper architectural refactoring in a separate PR to ensure proper separation of concerns and standardization across all storage providers.

@Umang01-hash (Member)

@Suryakantdsa One thing that can be resolved now:

These methods (Seek, ReadAt, WriteAt) should be fully implemented for the GCS provider rather than returning "not supported" errors:

func (*File) Seek(_ int64, _ int) (int64, error) {
	// Not supported: Seek requires reopening with range.
	return 0, errSeekNotSupported
}

func (*File) ReadAt(_ []byte, _ int64) (int, error) {
	return 0, errReadAtNotSupported
}

func (*File) WriteAt(_ []byte, _ int64) (int, error) {
	return 0, errWriteAtNotSupported
}

While GCS (like S3) doesn't natively support random-access operations, we can implement these methods using patterns similar to the S3 implementation (a rough sketch follows below):

  1. For Seek(): maintain an internal offset and implement the standard seek behaviors (SEEK_SET, SEEK_CUR, SEEK_END)
  2. For ReadAt(): download only the necessary portion of the file using GCS range requests
  3. For WriteAt(): download the entire file, modify the content at the offset, then re-upload it

This would provide a consistent API across storage providers and let users call these methods regardless of the underlying storage system. Please follow the S3 implementation pattern, with appropriate observability via the common file.OperationObservability helper.
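
A rough sketch of the suggested approach, assuming a hypothetical gcsFile wrapper around *storage.ObjectHandle (names and error handling are illustrative, not the final implementation):

package gcs

import (
	"context"
	"errors"
	"io"

	"cloud.google.com/go/storage"
)

type gcsFile struct {
	ctx    context.Context
	obj    *storage.ObjectHandle
	offset int64 // internal offset maintained for Seek
}

// Seek updates the internal offset; SeekEnd resolves the object size first.
func (f *gcsFile) Seek(offset int64, whence int) (int64, error) {
	var base int64

	switch whence {
	case io.SeekStart:
		base = 0
	case io.SeekCurrent:
		base = f.offset
	case io.SeekEnd:
		attrs, err := f.obj.Attrs(f.ctx)
		if err != nil {
			return 0, err
		}
		base = attrs.Size
	default:
		return 0, errors.New("gcs: invalid whence")
	}

	if base+offset < 0 {
		return 0, errors.New("gcs: negative position")
	}

	f.offset = base + offset
	return f.offset, nil
}

// ReadAt downloads only the requested byte range via a GCS range request.
func (f *gcsFile) ReadAt(p []byte, off int64) (int, error) {
	r, err := f.obj.NewRangeReader(f.ctx, off, int64(len(p)))
	if err != nil {
		return 0, err
	}
	defer r.Close()

	return io.ReadFull(r, p) // errors if fewer than len(p) bytes are available
}

// WriteAt downloads the whole object, patches it at the offset, re-uploads.
func (f *gcsFile) WriteAt(p []byte, off int64) (int, error) {
	r, err := f.obj.NewReader(f.ctx)
	if err != nil {
		return 0, err
	}

	data, err := io.ReadAll(r)
	r.Close()
	if err != nil {
		return 0, err
	}

	if end := off + int64(len(p)); end > int64(len(data)) {
		grown := make([]byte, end) // grow the buffer when writing past EOF
		copy(grown, data)
		data = grown
	}
	copy(data[off:], p)

	w := f.obj.NewWriter(f.ctx)
	if _, err := w.Write(data); err != nil {
		_ = w.Close()
		return 0, err
	}

	return len(p), w.Close()
}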

Please proceed with these changes here, and I will pick up updates on the new PR #2435 from here on.

@Suryakantdsa (Author)

Ok, I will proceed with them.

Thanks @Umang01-hash!

@Suryakantdsa (Author)

Hi @Umang01-hash,

I've implemented the Seek, ReadAt, and WriteAt methods for the GCS provider as requested.
Please take a look when you get a chance.

Thanks!

@Umang01-hash (Member)

@Suryakantdsa Thank you for your contribution. Since we are now continuing the review and enhancements in the new PR #2435, I am closing this PR for now. We will continue to resolve any review comments there and hope to merge this feature soon!

@Suryakantdsa (Author)

Thanks for the update! 😊 @Umang01-hash

Kinda sad to see this one closed without merging 😅, but I totally get it.
Excited to see things moving forward in #2435!
Really appreciate the feedback and the chance to contribute — can’t wait to see it merged soon!
